Lecture 1

Introduction

Sara Martino

Dept. of Mathematical Science, NTNU

Introduction

What is INLA? What is inlabru?

The short answer:

INLA is a fast method for Bayesian inference with latent Gaussian models, and inlabru is an R package that implements this method with a flexible and simple interface.

The (much) longer answer:

cite some papers here

Where?

The software, information, examples and help can be found at

cite github for inlabru

  • paper
  • tutorials
  • discussion group

So… Why should you use inlabru?

  • What type of problems can we solve?
  • What type of models can we use?
  • When can we use it?

To give proper answers to these questions, we need to start at the very beginning…

The core

  • We have observed something.
  • We have questions.
  • We want answers!

How do we find answers?

We need to make choices:

  • Bayesian or frequentist?
  • How do we model the data?
  • How do we compute the answer?

These questions are not independent.

Bayesian or frequentist?

In this course we embrace the Bayesian perspective:

  • There are no “true but unknown” parameters!
  • Every parameter is described by a probability distribution!
  • Evidence from the data is used to update the belief we had before observing the data!
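This updating idea can be sketched numerically. Below is a minimal, generic grid-approximation example in Python (a hypothetical coin-flip setting, not inlabru code; all numbers are made up for illustration):

```python
import numpy as np

# Grid of candidate values for an unknown success probability p
p = np.linspace(0.001, 0.999, 999)

# Prior belief before seeing the data: uniform over (0, 1)
prior = np.ones_like(p)
prior /= prior.sum()

# Observed data: 7 successes in 10 trials
k, n = 7, 10
likelihood = p**k * (1 - p)**(n - k)

# Bayes' rule: posterior is proportional to likelihood times prior
posterior = likelihood * prior
posterior /= posterior.sum()

# Posterior mean; matches the exact Beta(8, 4) mean 8/12 up to grid error
print(round(float(np.sum(p * posterior)), 3))  # ≈ 0.667
```

The flat prior is updated by the data into a posterior concentrated around the observed frequency; with more data, the likelihood dominates the prior.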

Let’s formalize this a bit…

The elements of an inlabru-friendly statistical model are:

  1. The observational model \[ \begin{aligned} y_i|\eta_i, \sigma^2 & \sim\mathcal{N}(\eta_i,\sigma^2),\qquad i = 1,\dots,n\\ E(y_i|\eta_i, \sigma^2) & = \eta_i \end{aligned} \] Note: We assume that, given the linear predictor \(\eta\), the data are independent of each other! Dependence between data points is expressed through the components of the linear predictor.

  2. A model for the linear predictor \[ E(y_i|\eta_i,\sigma^2) = \eta_i = \color{red}{\boxed{\beta_0}} + \color{red}{\boxed{\beta_1x_i}} \] Note: These are the components of our model! They explain the dependence structure of the data.

  3. A prior for the model components \(\mathbf{u}\) \[ \mathbf{u} = \{\beta_0, \beta_1\}\sim\mathcal{N}(0,\mathbf{Q}^{-1}) \] Note: The components always have a Gaussian prior and are used to explain the dependencies among the data!

  4. A prior for the non-Gaussian parameters \(\theta\), here \(\theta = \sigma^2\).

Latent Gaussian Models (LGM)

  1. The observation model: \[ \pi(\mathbf{y}|\eta,\theta) = \prod_{i=1}^{n}\pi(y_i|\eta_i,\theta) \]

  2. Linear predictor \(\eta_i = \beta_0 + \beta_1 x_i\)

  3. Latent Gaussian field \(\pi(\mathbf{u}|\theta)\)

  4. The hyperparameters: \(\pi(\theta)\)

  • Stage 1 The data generating process

  • Stage 2 The dependence structure

  • Stage 3 The hyperparameters

Q: What are we interested in?

\[ \pi(\mathbf{u},\theta|\mathbf{y})\propto \pi(\mathbf{y}|\mathbf{u},\theta)\pi(\mathbf{u}|\theta)\pi(\theta) \]
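In practice we rarely need this joint posterior itself. INLA approximates the posterior marginals of each latent component and each hyperparameter by nested numerical integration over the (low-dimensional) hyperparameter space, e.g.

\[ \pi(u_j|\mathbf{y}) = \int \pi(u_j|\theta,\mathbf{y})\,\pi(\theta|\mathbf{y})\,\text{d}\theta \]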

Here is where inlabru comes in 😃

inlabru for linear regression

The Model \[ \begin{aligned} y_i|\eta_i, \sigma^2 & \sim \mathcal{N}(\eta_i,\sigma^2)\\ \eta_i & = \color{red}{\boxed{\beta_0}} + \color{red}{\boxed{\beta_1 x_i}} \end{aligned} \]

library(inlabru)
bodyfat[1:3,c("Abdomen","Bodyfat")]
  Abdomen Bodyfat
1    85.2    12.3
2    83.0     6.1
3    87.9    25.3

The code

# define model components
cmp =  ~ -1 + beta0(1) + beta1(Abdomen, model = "linear")

# define model predictor
eta = Bodyfat ~ beta0 + beta1

# build the observation model
lik = bru_obs(formula = eta,
              family = "gaussian",
              data = bodyfat)

# fit the model
fit = bru(cmp, lik)
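The slide's R code needs the inlabru package and the bodyfat data. The observation model itself can be sanity-checked with generic simulation code (hypothetical parameter values; with a flat prior on the coefficients, the posterior mode coincides with the least-squares fit):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate from the observation model: y_i ~ N(beta0 + beta1 * x_i, sigma^2)
beta0, beta1, sigma = -40.0, 0.6, 4.0    # hypothetical "true" values
x = rng.uniform(70, 120, size=500)       # an Abdomen-like covariate
y = beta0 + beta1 * x + rng.normal(0.0, sigma, size=500)

# With a flat prior on (beta0, beta1), the posterior mode equals the
# least-squares estimate
X = np.column_stack([np.ones_like(x), x])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)  # close to (-40, 0.6)
```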

inlabru for linear regression

Real-world datasets are more complicated!

Data can have several dependence structures: temporal, spatial,…

Using a Bayesian framework:

  • Build (hierarchical) models to account for potentially complicated dependency structures in the data.

  • Attribute uncertainty to model parameters and latent variables using priors.

Two main challenges:

  • Need computationally efficient methods to calculate posteriors (this is where INLA helps!).
  • Select priors in a sensible way (we’ll talk about this).

The good news!!

In many cases complicated spatio-temporal models are just special cases of the same model structure!! 😃

  • Stage 1: What is the distribution of the responses?

    • Gaussian response? (temperature, rainfall, fish weight …)
    • Count data? (people infected with a disease in each area)
    • Point pattern? (locations of trees in a forest)
    • Binary data? (yes/no response, binary image)
    • Survival data? (recovery time, time to death)

    We assume the data to be conditionally independent given the model components and some hyperparameters; this means that all dependencies in the data are explained in Stage 2.

  • Stage 2: What are the model components, and what is their distribution? Here we can have:

    • Fixed effects for covariates
    • Unstructured random effects (individual effects, group effects)
    • Structured random effects (AR(1), regional effects, …)

    These are linked to the responses in the likelihood through linear predictors.

  • Stage 3: What are our prior beliefs about the parameters controlling the components in the model?

    The likelihood and the latent model typically have hyperparameters that control their behavior. They can include:

    • Variance of the observation noise
    • Dispersion parameter in the negative binomial model
    • Variance of unstructured effects

The second good news!

No matter how complicated your model is, the inlabru workflow is always the same 😃

# Define model components
comps <- component_1(...) + 
  component_2(...) + ...

# Define the model predictor
pred <- non_linear_function(component_1, 
                            component_2, ...)

# Build the observation model
lik <- bru_obs(formula = pred,
               family = ... ,
               data = ... ,
                ...)

# Fit the model
fit <- bru(comps, lik, ...)

The Tokyo rainfall data

One example with time series: rainfall over 1 mm in the Tokyo area was recorded for each calendar day during the two years 1983–84.

The model

Stage 1 The observation model

\[ y_t|\eta_t\sim\text{Bin}(n_t, p_t),\qquad \eta_t = \text{logit}(p_t),\qquad t = 1,\dots,366 \] \[ n_{t} = \left\{ \begin{array}{lr} 1, & \text{for}\; 29\; \text{February}\\ 2, & \text{other days} \end{array}\right. \] \[ y_{t} = \begin{cases} \{0,1\}, & \text{for}\; 29\; \text{February}\\ \{0,1,2\}, & \text{other days} \end{cases} \]

  • the likelihood has no hyperparameters

Stage 2 The latent field \[ \eta_t = \beta_0 + f(\text{time}_t) \]

  • the probability of rain on day \(t\) depends on \(\text{time}_t\)

  • \(\beta_0\) is an intercept

  • \(f(\text{time}_t)\) is a RW2 model (this is just a smoother). The smoothness is controlled by a hyperparameter \(\tau_f\)

Stage 3 The hyperparameters

  • The structured time effect is controlled by one parameter \(\tau_f\).

  • We assign a prior to \(\tau_f\) to finalize the model.
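The RW2 smoother has a Gaussian prior whose precision matrix penalizes second differences of the time effect. A minimal sketch of the non-cyclic construction (generic numerical code, not the exact cyclic matrix used for the Tokyo model):

```python
import numpy as np

def rw2_precision(n, tau=1.0):
    """Precision matrix Q = tau * D'D of a (non-cyclic) RW2 prior,
    where D takes second differences u[t-1] - 2*u[t] + u[t+1]."""
    D = np.zeros((n - 2, n))
    for t in range(n - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    return tau * D.T @ D

Q = rw2_precision(10)
# Q is banded and singular: constants and linear trends are in its
# null space, so the RW2 prior only penalizes curvature
print(Q.shape)                          # (10, 10)
print(np.allclose(Q @ np.ones(10), 0))  # True
```

A larger \(\tau_f\) puts more prior weight on smooth (nearly linear) trajectories; a smaller \(\tau_f\) allows a wigglier time effect.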

inlabru for time series

The Model

\[ \begin{aligned} y_t|\eta_t & \sim \text{Binomial}(n_t,p_t)\\ \text{logit}(p_t) = \eta_t & = \color{red}{\boxed{\beta_0}} + \color{red}{\boxed{f(\text{time}_t)}} \end{aligned} \]

Tokyo[1:3,]
  y n time
1 0 2    1
2 0 2    2
3 1 2    3

The code

# define model component
cmp =  ~ -1 + beta0(1) + time_effect(time, model = "rw2", cyclic = TRUE)

# define model predictor
eta = y ~ beta0 + time_effect

# build the observation model
lik = bru_obs(formula = eta,
              family = "binomial",
              Ntrials = n,
              data = Tokyo)

# fit the model
fit = bru(cmp, lik)

Example: disease mapping

We observed larynx cancer mortality counts for males in 544 districts of Germany from 1986 to 1990 and want to build a model.

  • \(y_i\): the count in district \(i\).

  • \(E_i\): an offset; the expected number of cases in district \(i\).

  • \(c_i\): a covariate (level of smoking consumption) in district \(i\).

  • \(\boldsymbol{s}_i\): the spatial location of district \(i\).

Bayesian disease mapping

  • Stage 1: We assume the responses are Poisson distributed: \[ y_i \mid \eta_i \sim \text{Poisson}(E_i\exp(\eta_i)) \]
  • Stage 2: \(\eta_i\) is a linear function of three components: an intercept, a covariate effect, and a spatially structured effect \(\omega_i\): \[ \eta_i = \beta_0 + \beta_1\ c_i + \omega_i \]
  • Stage 3:
    • \(\tau_{\omega}\): precision parameter for the random effects

The latent field is \(\boldsymbol{u} = (\beta_0, \beta_1, \omega_1, \omega_2,\ldots, \omega_n)\); the hyperparameters \(\boldsymbol{\theta} = (\tau_{\omega})\) must be given a prior.
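The role of the offset \(E_i\) can be illustrated with a few lines of generic code: \(\exp(\eta_i)\) acts as a relative risk that scales the expected count (hypothetical numbers, loosely inspired by the Germany data):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical offsets (expected cases) and linear predictors, 3 districts
E = np.array([7.97, 22.84, 22.09])
eta = np.array([0.2, -0.1, 0.0])   # eta_i = log relative risk

# Stage 1: y_i | eta_i ~ Poisson(E_i * exp(eta_i))
mean_counts = E * np.exp(eta)
y = rng.poisson(mean_counts)

# exp(eta_i) > 1: more cases than expected; < 1: fewer
print(np.exp(eta))  # roughly 1.22, 0.90, 1.00
```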

inlabru for disease mapping

The Model

\[ \begin{aligned} y_i|\eta_i & \sim \text{Poisson}(E_i\lambda_i)\\ \text{log}(\lambda_i) = \eta_i & = \color{red}{\boxed{\beta_0}} + \color{red}{\boxed{\beta_1\ c_i}} + \color{red}{\boxed{\omega_i}} \end{aligned} \]

g = system.file("demodata/germany.graph",
                package="INLA")
Germany[1:3,]
  region         E  Y  x region.struct
1      1  7.965008  8 56             1
2      2 22.836219 22 65             2
3      3 22.094716 19 50             3

The code

# define model component
cmp =  ~ -1 + beta0(1) + beta1(x, model = "linear") +
  space(region, model = "bym2", graph = g)

# define model predictor
eta = Y ~ beta0 + beta1 + space

# build the observation model
lik = bru_obs(formula = eta,
              family = "poisson",
              E = E,
              data = Germany)

# fit the model
fit = bru(cmp, lik)

Take home message!